Predicting protein function from domain content

Authors

  • Kristoffer Forslund
  • Erik L. L. Sonnhammer
Abstract

MOTIVATION: Computational assignment of protein function may be the single most vital application of bioinformatics in the post-genome era. These assignments are made based on various protein features, where one is the presence of identifiable domains. The relationship between protein domain content and function is important to investigate, to understand how domain combinations encode complex functions.

RESULTS: Two different models are presented on how protein domain combinations yield specific functions: one rule-based and one probabilistic. We demonstrate how these are useful for Gene Ontology annotation transfer. The first is an intuitive generalization of the Pfam2GO mapping, and detects cases of strict functional implications of sets of domains. The second uses a probabilistic model to represent the relationship between domain content and annotation terms, and was found to be better suited for incomplete training sets. We implemented these models as predictors of Gene Ontology functional annotation terms. Both predictors were more accurate than conventional best BLAST-hit annotation transfer and more sensitive than a single-domain model on a large-scale dataset. We present a number of cases where combinations of Pfam-A protein domains predict functional terms that do not follow from the individual domains.

AVAILABILITY: Scripts and documentation are available for download at http://sonnhammer.sbc.su.se/multipfam2go_source_docs.tar
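The rule-based idea in the abstract (a set of domains strictly implying an annotation term) can be illustrated with a toy sketch. This is not the authors' implementation, which is available at the URL above. It assumes a simple reading of "strict implication": a term is implied by a domain set if every training protein containing that set carries the term. The names `learn_strict_rules` and `predict`, the domain labels `DomA`/`DomB`, and the `GO:toy*` terms are all hypothetical placeholders.

```python
from collections import defaultdict
from itertools import combinations


def learn_strict_rules(training, max_size=2):
    """Toy rule learner (an assumption, not the published method).

    training: list of (set of domain names, set of annotation terms).
    Returns {frozenset of domains: set of terms shared by every
    training protein that contains all of those domains}.
    """
    support = defaultdict(list)
    for domains, terms in training:
        for k in range(1, max_size + 1):
            for combo in combinations(sorted(domains), k):
                support[frozenset(combo)].append(set(terms))
    rules = {}
    for combo, term_sets in support.items():
        implied = set.intersection(*term_sets)  # terms present in all cases
        if implied:
            rules[combo] = implied
    return rules


def predict(rules, domains):
    """Union of the terms implied by every rule whose domain set is present."""
    predicted = set()
    for combo, terms in rules.items():
        if combo <= domains:
            predicted |= terms
    return predicted


# Hypothetical toy data: GO:toy3 appears only when both domains co-occur.
training = [
    ({"DomA"}, {"GO:toy1"}),
    ({"DomB"}, {"GO:toy2"}),
    ({"DomA", "DomB"}, {"GO:toy1", "GO:toy2", "GO:toy3"}),
    ({"DomA", "DomB"}, {"GO:toy1", "GO:toy2", "GO:toy3"}),
]
rules = learn_strict_rules(training)
both = predict(rules, {"DomA", "DomB"})
only_a = predict(rules, {"DomA"})
```

In this toy data the single-domain rules yield only `GO:toy1` and `GO:toy2`, while the two-domain rule adds `GO:toy3`, which mirrors the abstract's point that some terms follow from a combination but not from the individual domains. The probabilistic model mentioned in the abstract is not sketched here.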


Similar resources

Predicting Student Knowledge Level from Domain-Independent Function and Content Words

We explored the possibility of predicting the quality of student answers (error-ridden, vague, partially-correct, and correct) to tutor questions by examining their linguistic patterns in 50 tutoring sessions with expert human tutors. As an alternative to existing computational linguistic methods that focus on domain-dependent content words (e.g., velocity, RAM, speed) in interpreting a student...


P-30: The Effect of The T26248G Polymorphism on Putative Methyltransferase Nsun7 Protein Function and Its Role in Male Infertility

Background: Male infertility has many causes, including genetic infertility. The NOP2/Sun domain family, member 7 (Nsun7) gene, which encodes putative methyltransferase Nsun7, has a role in sperm motility. The aim of the present study was to investigate the effect of the T26248G polymorphism on Nsun7 protein function and its role in male infertility. Materials and Methods: Semen samples were col...


In Silico Characterization of Proteins Containing ARID-PHD Domain and Its Expression in Aeluropus littoralis Halophyte

Abiotic stresses are the most important factors that reduce the yield of crops. In this case, bioinformatics analysis plays an important role in studying genes and their relatedness, as well as in predicting their function in response to abiotic stresses. Among all domains, the ARID-PHD domain has been identified in plants and animals and has a very significant role in growth regulation, cell cycle, and ...


Effects of T208E activating mutation on MARK2 protein structure and dynamics: Modeling and simulation

Microtubule Affinity-Regulating Kinase 2 (MARK2) protein has a substantial role in regulation of vital cellular processes like induction of polarity, regulation of cell junctions, cytoskeleton structure and cell differentiation. The abnormal function of this protein has been associated with a number of pathological conditions like Alzheimer disease, autism, several carcinomas and development of...


Dengue virus type-3 envelope protein domain III; expression and immunogenicity

Objective(s): Production of a recombinant and immunogenic antigen using dengue virus type-3 envelope protein is a key point in dengue vaccine development and diagnostic research. The goals of this study were to produce a recombinant protein from dengue virus type-3 envelope protein and to evaluate its immunogenicity in mice. Materials and Methods: Multiple amino acid sequences of different i...



Journal title:
  • Bioinformatics

Volume 24, Issue 15

Pages: -

Publication date: 2008